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Abstract 

The  Hansen  and  Lebedeff  data  set  on  global  surface  air  temperature  change 
is  reanalyzed  using  smoothing  splines  designed  to  estimate  the  conditional 
quantile  functions  of  global  temperature  over  the  last  century.  It  is  assumed 
only  that  the  quantiles  are  smooth  functions  of  time.  The  smoothness  of  the 
fitted  quantile  functions  is  determined  by  a  data  driven  version  of  the  Schwarz 
criterion.  The  estimates  offer  statistical  evidence  of  a  break  in  the  generally 
upward  sloping  trend  of  the  temperature  series  during  a  "cooling  period"  from 
1940  to  1965. 

1      Introduction 

There  has  been  considerable  recent  interest  in  analyzing  global  temperature  data 
and  controversy  concerning  the  detection  of  a  warming  trend.  In  this  paper  we 
reconsider  the  widely  studied  annual  data  set  developed  by  Hansen  and  Lebedeff  [5], 
updated  in  [6],  and  in  [3]  for  the  period  1880-1990.  Looking  at  the  data,  it  is  obvious 
that  the  temperatures  of  recent  years  are  higher  than  those  of  the  1880s.  Hansen 
and  Lebedeff  distinguished  three  periods:  increasing  temperature  between  1880 
and  1940,  decreasing  temperature  between  1940  and  1965,  and  a  second  warming 
period  after  1965.  It  is  far  from  clear,  however,  whether  the  global  temperatures 
over  this  period  increased  due  to  the  natural  variability  of  the  climate  system  and 
hence  may  decrease  during  the  next  century  for  the  same  reason,  or  whether  they 
are  likely  to  continue  to  increase  due  to  some  deterministic  process.  This  issue  is 
obviously  very  complex,  since  surface  air  temperature  is  influenced  by  many  factors 
including  greenhouse  gases,  volcanic  activity,  the  ocean,  clouds,  solar  activity,  and 
urbanization. 


We  will  not  attempt  to  analyze  causalities  between  global  temperature  changes 
and  the  enumerated  influences.  We  will  simply  consider  the  temperature  data  as  a 
random  process  {yt}  having  the  time  dependent  conditional  distribution  function 
Ft(.).  A  rationale  for  this  approach  is  the  idea,  that  long-term  climatic  changes  de- 
termine the  shape  and  the  location  of  the  distribution  function  Ft(.)  and  short-term 
fluctuations  occur  due  to  a  random  process,  which  reflects  the  natural  variability 
of  the  climatic  system.  Several  authors  have  recently  suggested  statistical  models 
for  the  Hansen  and  Lebedeff  global  temperature  data.  These  models  may  all  be 
viewed  as  special  cases  of  the  simple  form  yt  =  9{t)  +  ut  with  ut  where  the  term 
9(t)  may  be  regarded  as  a  deterministic  trend,  and  the  ut  is  a  stochastic  term  which 
is  assumed  to  follow  an  ARIMA  process.  Estimating  such  models  by  conventional 
least-squares  methods  yields  a  forecasting  model  for  the  conditional  mean  of  yt. 
Galbraith  and  Green  [4]  consider  the  hypothesis  of  a  unit  root  in  the  autoregres- 
sive  polynomial  of  the  ut  process,  but  reject  the  hypothesis  for  monthly  data  in 
favor  of  a  deterministic  trend  formulation.  Nevertheless,  as  noted  in  Bassett  [1], 
serial  correlation  is  suggested  by  a  variety  of  mechanisms  that  could  plausibly  affect 
year-to-year  changes  in  global  climate:  thermal  feedback  between  the  ocean  and 
the  atmosphere,  sunspots,  or  volcanic  activity.  Bloomfield  [2]  considers  the  broader 
class  of  ARIM A(p,  d,  q)  models. 

In  this  paper,  we  employ  a  nonparametric  approach  to  the  analysis  of  global 
temperature  and  depart  from  the  exclusive  focus  on  models  for  conditional  expec- 
tations. Instead  of  estimating  an  equation  for  E(yf),  we  directly  estimate  models 
for  the  quantiles  F^1(t)  of  the  underlying  conditional  density  function  Ft(.).  To 
avoid  an  a  prion  specification  of  the  parametric  structure  of  the  process  we  assume 
only  that  F^l{r)  is  a  smooth  function  of  the  time  index  t.  In  contrast  to  nonpara- 
metric methods  designed  to  estimate  models  for  the  conditional  mean  function,  the 
quantile  methods  provide  a  degree  of  robustness  with  respect  to  heterogeneity  of 
the  stochastic  component  of  global  temperature  and,  especially,  with  respect  to 
outlying  observations. 


2      Quantile  Smoothing  Splines 


The  methods  used  in  this  paper  are  based  on  quantile  regression  ideas  introduced 
in  Koenker  and  Bassett  [8].  In  their  simplest  parametric  form  one  may  consider 
the  problem  of  estimating  a  vector  of  unknown  regression  parameters,  /?  from  a 
sample  of  independent  observations  on  the  random  variables  j/i , . . . ,  yn  distributed 
according  to  P(yt  <  Y)  =  Ft(Y\xt)  with  (t  =  1, . .  .,n),  where  x\  denotes  a  row  of  a 
known  n  x  (p+  1)  design  matrix.  Assuming  that  the  conditional  quantile  functions 
of  Y  are  linear  in  xt,  i.e.  that  there  exists  a  vector  0  such  that 

i-l 


Ft-1(r\X  =  x't)  =  x't(3,  (1) 


the  Tth  regression  quantile  is  denned  as  any  solution  to  the  minimization  problem: 

(2) 


min 

b€Rk+l 


J2        T\yt-x'tb\+         £        (I  -  r)\yt  -  x'tb\ 
<€{«|y«>*',*}  t€{t\yt<x'tb} 


The  least  absolute  error  estimator  is  the  regression  median,  i.e.  the  r  =  1/2  quantile. 
Equation  (2)  can  be  rewritten  more  concisely  as 

n 

where  pT(u)  =  u(t  —  I(u  <  0))  and  /(.)  denotes  the  indicator  function.  Of  course, 
the  estimation  of  regression  quantiles  according  to  Equation  (3)  presumes  a  linear 
functional  relationship  between  dependent  and  independent  variables.  As  suggested 
above,  we  will  consider  a  more  general,  nonparametric  approach  to  estimating  con- 
ditional quantile  models  based  on  smoothing  splines. 

Smoothers  are  an  extremely  large  class  of  estimation  methods  designed  to  sum- 
marize the  dependence  of  a  response  measurement  {yt}  as  a  function  of  one  or  more 
predictor  measurements.  Since  reasonable  predictions  are  less  variable  than  the  re- 
sponse measurement  itself,  the  procedure  may  be  viewed  as  smoothing  the  original 
observations.  A  moving  average  is  an  example  for  a  simple  smoother.  Another  class 
of  smoothers  is  defined  by  the  following  optimization  problem:  among  all  functions 
g(x)  with  two  continuous  derivatives,  find  the  one  that  minimizes  the  penalized 
residual  sum  of  squares 


f>t  -  <7(*<))2  +  A  /  (g"(x)fdx 
(=1  Ja 


(4) 


where  A  is  a  fixed  constant,  and  a  <  X\  <  . . .  <  xn  <  b.  The  minimizer  of  (4)  is 
a  natural  cubic  spline,  hence  the  smoother  is  called  smoothing  spline.  Hastie  and 
Tibshirani  [7]  provide  an  excellent  recent  treatment  of  this  approach. 

Koenker  et  al.  [9]  have  suggested  a  modification  of  Equation  (4),  replacing  the 
fidelity  term  with  the  quantile  fidelity  criterion  pT()  and  the  Li  roughness  penalty 
by  either  an  L\  or  L^  penalty  term.  The  modified  penalty  has  the  important 
advantage  that  it  maintains  the  linear  programming  formulation  of  the  solution 
algorithm.  Indeed  they  consider  a  broader  class  of  quantile  smoothing  splines  which 
minimize 


R[9rA  =  £>(l*  ~  9r,x(Xt))  +  A(  /     KA(x)|"dx)] 


for  appropriate  choice  of  Q.  For  p  =  1  solutions  are  linear  splines  with  knots  on 
the  mesh  0  =  xq  <  x\  <  . . .  <  xn  <  xn+i  =  1-  For  p  =  oo  solutions  are  quadratic 
splines,  but  the  knot  selection  is  somewhat  more  complicated.  For  both  choices  of 
p  the  problem  can  be  solved  by  elementary  linear  program  techniques. 


The  choice  of  A  determines  the  smoothness  of  the  fitted  curve.  As  A  tends  to  zero 
the  penalty  term  becomes  irrelevant  and  the  result  is  a  spline  which  interpolates 
{yt}  at  the  design  points  Xi, . .  .,xn.  When  there  are  repeated  design  observations, 
the  chosen  quantiles  at  the  distinct  design  points  are  interpolated.  As  A  grows  to 
infinity,  the  second  derivative  of  the  estimated  function  gT\(xt)  is  forced  to  be  zero 
and  the  optimization  corresponds  to  the  simple  linear  quantile  regression  problem. 
Here  since  the  number  of  interpolated  observations,  denoted  below  by  k(\),  provides 
a  natural  measure  of  the  dimensionality  of  the  fit,  it  is  possible  to  use  familiar  model 
selection  criteria  as  suggested  by  Schwarz  [11].  Hastie  and  Tibshirani  present  a 
detailed  discussion  of  these  ideas  in  the  context  of  classical  least-squares  smoothing 
splines  in  [7]. 

Since  the  correct  dimensionality  of  the  underlying  model  is  unknown,  a  data 
driven  procedure,  which  provides  statistical  evidence  for  the  choice  of  A,  is  desirable. 
To  select  one  among  a  number  of  models  of  differing  dimensionality  Schwarz  [11] 
proposes  the  following  procedure:  choose  the  model  j  for  which 

k 
-logMJ(yi,...,y„)+  ylogn  (6) 

is  smallest.  Mj  denotes  the  maximized  likelihood  for  model  j  and  kj  its  dimension- 
ality. To  determine  Mj  for  the  median  regression  case,  we  may  interpret  the  fidelity 
as  the  log  likelihood  for  the  Laplace  density 

/(O/ -/<)/*)  =  ^e-|^l.  (7) 

The  log-likelihood  function  is  thus  defined  by 

n 

yt  -  n 


f(yi,.-.,2/n)  =  5^1og/((yi  -/*)/*)  =  -nlog<r-nlog2-£ 

t=\  <=i 

Substituting  the  maximum  likelihood  estimators  one  obtains 


(8) 


logM  =  -nlog  I  -J2\Vt  -/'I  J  ~c(n)  (9) 

where  c(n)  is  a  constant  depending  only  on  the  number  of  observations  n.  For  the 
maximization  of  Equation  (6)  we  need  only  consider  the  first  term  of  (9)  and  can 
define  the  information  criterion 

SIC(X)  =  log  (  -  V  lift  -  9r,x(xt)\)  +  ^  logn,  (10) 

which  may  be  minimized  over  A.  It  is  important  to  observe  that  for  both  penalties 
it  is  possible  to  compute  the  entire  solution  path  in  the  bandwidth  parameter  A 
by  parametric  linear  programming  methods,  and  consequently  we  need  not  restrict 
the  solutions  to  an  arbitrary  grid  as  is  typically  done  in  the  case  of  least-squares 
smoothing  splines. 


3      Empirical  Results 

We  apply  the  quantile  smoothing  spline  approach  to  the  data  set  on  global  temper- 
ature changes  by  Hansen  and  Lebedeff.  In  this  paper,  the  parameter  A  is  selected 
for  all  quantiles  by  minimizing  SIC(X)  for  the  median  smoothing  spline.  In  other 
applications,  different  quantiles  may  require  rather  different  degrees  of  smoothness, 
but  in  the  present  case  this  appears  to  be  unnecessary.  In  Figures  1  and  2  we  plot 
the  information  criterion  for  the  L\  smoothing  splines  over  the  range  of  (0.25,  250) 
and  for  the  Loo  smoothing  splines  over  the  interval  (20,50000).  For  the  L\  fit  the 
range  includes  the  extreme  cases  of  two  and  n  interpolated  yt's,  for  the  Loo  fit 
the  dimensionality  lies  between  three  and  n.  However,  we  calculated  SIC(X)  for 
A  >  50000  such  that  k  =  2,  but  did  not  obtain  an  additional  minimum.  For  the 
L\  splines  SIC(X)  has  a  global  minimum  at  A  =  26.  The  effective  number  of  pa- 
rameters is  k  —  5.  For  the  Loo  estimation  SIC(X)  has  its  minimum  at  A  =  1000 
with  the  dimensionality  k  =  6.  The  estimated  quantiles  r  =  0.1,0.25,0.5,0.75,0.9 
for  the  selected  models  are  plotted  in  Figures  3  and  4. 

As  pointed  out  in  Section  2,  the  L\  smoothing  splines  are  piecewise  linear.  The 
L\  fit  to  the  Hansen-Lebedeff  data  divides  the  time  scale  very  clearly  into  the  three 
periods  mentioned  in  the  introduction:  warming  until  the  late  1930s,  a  period  of 
almost  stable  or  slightly  decreasing  temperature  until  the  mid  1960s  and  a  second 
warming  period  up  to  the  present.  In  the  warming  period  the  density  function 
shifts  almost  linearly  upwards.  In  the  first  period  the  median  moves  towards  the 
0.25  quantile  until  1910  and  afterwards  back  again  so  that  at  the  end  of  the  first 
period  the  distribution  is  almost  symmetric.  In  the  cooling  period,  particularly  the 
0.25  and  0.75  quantiles  are  moving  downwards.  The  whole  density  in  the  third 
period,  however,  shifts  upwards  again.  In  contrast  to  the  L\  splines,  the  Loo  splines 
are  smoother  since  the  penalty  imposes  a  global  upper  bound  on  g"  A.  Although 
the  transitions  between  the  different  periods  are  not  as  sharp  as  in  the  Li  case, 
the  fitted  function  can  again  be  divided  in  the  three  phases:  warming  -  cooling  - 
warming. 

Galbraith  and  Green  [4]  have  pointed  out  that  by  applying  a  Dickey  and  Fuller 
type  of  test  (ADF)  to  monthly  data  the  hypothesis  of  a  unit  root  has  to  be  rejected 
against  the  alternative  of  a  trend  stationary  process.  We  applied  a  similar  test  to 
the  annual  data  set  and  cannot  reject  the  hypothesis  of  a  unit  root.  Consequently, 
we  have  also  estimated  conditional  quantiles  for  the  first  differences  {Ay<}  under 
L\  and  Lqo  penalty.  The  Schwarz  Information  Criterion  suggests  in  both  cases 
to  choose  A  =  oo,  i.e.  to  approximate  the  conditional  quantiles  by  a  linear  model. 
The  estimated  conditional  median  is  essentially  zero  and  the  first  differences  appear 
to  be  almost  identically  distributed.  However,  as  Perron  [10]  has  emphasized  the 
presence  of  structural  breaks  tends  to  substantially  reduce  the  power  of  conventional 
unit  root  tests.  Consequently,  we  should  not  regard  the  ADF  results  as  conclusive 
findings  for  the  unit  root  hypothesis. 
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Figure  1:  Schwarz  Information  Criterion  for  L\  Quantile  Smoothing  Splines. 
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Figure  2:  Schwarz  Information  Criterion  for  Loo  Quantile  Smoothing  Splines. 
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Figure  3:  Li  Quantile  Smoothing  Splines  [0.1,  0.25,  0.5,  0.75,  0.9]  for  A  =  26. 
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Figure  4:  L^  Quantile  Smoothing  Splines  [0.1,  0.25,  0.5,  0.75,  0.9]  for  A  =  1000. 


4      Conclusion 

Two  breaks  in  the  upward  trend  in  temperature  suggested  by  the  conditional  quan- 
tile  estimates  contrast  with  the  findings  of  Galbraith  and  Green  [4]  and  Solow  [12]. 
Galbraith  and  Green,  rejecting  the  unit  root  hypothesis  for  monthly  global  tem- 
perature, conclude  that  a  deterministic  linear  trend  provides  the  preferable  model. 
Solow  tests  for  a  single  structural  break  in  a  two-phase  linear  trend  model,  but 
concludes,  using  data  from  the  southern  hemisphere,  that  a  single  break  can  be 
rejected  in  favor  of  the  linear  trend  model.  Using  a  more  flexible  approach,  which 
permits  several  possible  structural  breaks,  the  present  approach  suggests  that  the 
periodization  originally  proposed  by  Hansen  and  Lebedeff  is  reasonable.  However, 
the  problem  of  distinguishing  the  unit  root  model  from  the  broken  trend  suggested 
here  remains  a  pressing  open  research  problem. 
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